Recent empirical studies have demonstrated long-memory in the signs of orders to buy or sell in
financial markets [3, 19]. We show how this can be caused by delays in market clearing. Under the
common practice of order splitting, large orders are broken up into pieces and executed incrementally.
If the size of such large orders is power law distributed, this gives rise to power law decaying
autocorrelations in the signs of executed orders. More specifically, we show that if the cumulative
distribution of large orders of volume $v$ is proportional to $v^{-\alpha}$ and the size of executed orders is
constant, the autocorrelation of order signs as a function of the lag $\tau$ is asymptotically proportional
to $\tau^{-(\alpha-1)}$. This is a long-memory process when $\alpha < 2$. With a few caveats, this gives a good match
to the data. A version of the model also shows long-memory fluctuations in order execution rates,
which may be relevant for explaining the long-memory of price diffusion rates.
I. INTRODUCTION

A random process is said to have long-memory if it has an autocorrelation
function that is not integrable. This happens, for example, when the
autocorrelation function decays asymptotically as a power law of the form
$\tau^{-\gamma}$ with $\gamma < 1$. This is important because it implies that values
from the distant past can have a significant effect on the present, that the
stochastic process lacks a typical time scale, and that a stochastic process
whose increments have long-memory diffuses anomalously. Examples of
long-memory processes and anomalous diffusion have been observed in many
physical, biological and economic systems ranging from turbulence [26] to
chaotic dynamics due to flights and trapping [?], dynamics of aggregates of
amphiphilic molecules [23] and DNA sequences [?]. In finance the volatility,
roughly defined as the diffusion rate of price fluctuations, is known to be a
long-memory process [?]. In this paper we analyze a mechanism for creating a
long-memory process, based on converting a static power law distribution into
a random process with a power law autocorrelation function. Other examples of
stochastic processes relating power laws to long-memory have been given by
Mandelbrot [21] (analyzed by Taqqu and Levy [?]), and in the context of DNA
sequences by Buldyrev et al. [?].

Recently a new long-memory property of the order flow in a financial market
was independently observed by Bouchaud et al. in the Paris Stock Exchange and
Lillo and Farmer in the London Stock Exchange (LSE) [19]. These studies have
shown that there is a remarkable persistence in buying vs. selling. Labeling
the signs of trading orders as $\pm 1$ according to whether they are to buy or
to sell, the autocorrelation of observed order signs is strongly positive,
asymptotically decaying roughly as a power law $\tau^{-\gamma}$, where
$\gamma \approx 0.6$. Such positive autocorrelations can be measured at
statistically significant levels over time lags as long as two weeks.
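The sign autocorrelation described above can be estimated directly from a series of $\pm 1$ labels. The following is a minimal sketch, not the procedure used in the studies cited; the function name `sign_autocorrelation` and the synthetic IID input are assumptions introduced only for illustration.

```python
import random

def sign_autocorrelation(signs, max_lag):
    """Sample autocorrelation of a +1/-1 series for lags 1..max_lag."""
    n = len(signs)
    mean = sum(signs) / n
    var = sum((x - mean) ** 2 for x in signs) / n
    acf = []
    for lag in range(1, max_lag + 1):
        # Average product of mean-removed values separated by `lag` events.
        cov = sum((signs[t] - mean) * (signs[t + lag] - mean)
                  for t in range(n - lag)) / (n - lag)
        acf.append(cov / var)
    return acf

# Illustrative input only: IID random signs, which carry no memory,
# so every lag should give a value close to zero.
rng = random.Random(0)
iid_signs = [rng.choice((-1, 1)) for _ in range(20000)]
acf_iid = sign_autocorrelation(iid_signs, 5)
```

For a real order flow one would replace `iid_signs` with the observed sequence of buy (+1) and sell (-1) labels and fit the decay of the result on log-log axes.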

For example, in Fig. 1 we show the empirical autocorrelation function of the
time series of signs of orders that result in immediate trades for the stock
Shell. The autocorrelation function is well described by a power law decay
over almost three decades and a least squares fit to this gives
$\gamma = 0.53$. The fact that $\gamma < 1$ implies that this is a long-memory
process, i.e. its autocorrelation function decays so slowly that it is not
integrable. This is important because it implies that values from the distant
past have a significant effect on the present. A diffusion process built from
long-memory increments has a variance $\sigma^2$ that grows in time as
$\sigma^2(t) \sim t^{2H}$, where $H$ is called the Hurst exponent. For
$0 < \gamma < 1$, $H = 1 - \gamma/2$. For a normal diffusion process
$H = 1/2$, but when $H > 1/2$ the variance grows faster than linearly in $t$
(the standard deviation grows faster than $t^{1/2}$), which is called
anomalous diffusion. Another important consequence is that statistical
averages converge slowly, e.g. the mean of a quantity that displays anomalous
diffusion converges as $T^{-(1-H)}$, where $T$ is the sample size. The signs
of orders in the LSE have been shown to pass tests for long-memory with a high
degree of statistical significance [?].
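As a worked illustration of the relation $H = 1 - \gamma/2$ and of the slow convergence of averages, the sketch below evaluates both for the Shell fit quoted in the text. The function names are hypothetical helpers, not part of any library.

```python
def hurst_from_gamma(gamma):
    """Hurst exponent H = 1 - gamma/2, valid for 0 < gamma < 1."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("the relation assumes 0 < gamma < 1")
    return 1.0 - gamma / 2.0

def mean_convergence_exponent(hurst):
    """Exponent e such that the error of a sample mean shrinks as T**(-e)."""
    return 1.0 - hurst

gamma_shell = 0.53                      # least squares fit quoted for Shell
h_shell = hurst_from_gamma(gamma_shell)
e_shell = mean_convergence_exponent(h_shell)
print(h_shell, e_shell)
```

For comparison, a process with $H = 1/2$ gives the familiar exponent $1/2$, so with the Shell value roughly twice as many decades of data are needed for the same reduction in error.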

From an economic point of view this is important because of its implications
for market efficiency. All other things being equal, since buy orders tend to
drive the price up and sell orders tend to drive it down, this would imply
that it was possible to make profits using a simple linear model to predict
future price movements. In order to prevent this the market has to make
substantial compensating adjustments [?]. The difficulty of making such
adjustments perfectly may have important implications about the origin of
long-memory in the volatility of prices.

In this paper we hypothesize that the cause of the long-memory of order flow
is a delay in market clearing. To make this clearer, imagine that a large
investor like Warren Buffett decides to buy ten million shares of a company.
It is unrealistic for him to simply state his demand to the world and let the
market do its job. There are unlikely to be sufficient sellers present, and
even if there were, revealing a large order tends to push the price up.
Instead he keeps his intentions as secret as possible and trades the order
incrementally over an extended period of time, possibly through
intermediaries. In a study of this phenomenon, about a third of the dollar
value of such institutional trades took more than a week to complete [?].
This conflicts with standard neoclassical economic models, which assume
market clearing, i.e. that the price always adjusts so that supply and demand
are evenly matched. The fact that large orders are kept secret and executed
incrementally implies that at any given time there may be a substantial
imbalance of buyers and sellers, which can be interpreted as a failure of
market clearing: supply and demand do not match. Effective market clearing is
delayed by variable amounts that depend on fluctuations in the size and signs
of unrevealed orders.

We propose a simple model to explain the long-memory 
of order flow based on delays in market clearing. We pos- 
tulate that unrevealed hidden orders are distributed ac- 
cording to a power law. These are broken up into pieces, 
which we call revealed orders, that are submitted at a 
steady rate. We show that this leads to long-memory in 
order flow, yielding a model consistent with empirical ob- 
servations. The main result is an analytic computation 
relating the exponent of the power law of the volume 
distribution of hidden orders to the rate of decay of the 
long-memory process characterizing revealed orders. 

The paper is organized as follows: In Section II we define the two models
that we study here, which we call the fixed $N$ model and the $\lambda$ model.
In Section III we analytically compute the autocorrelation function of
revealed orders for the fixed $N$ model in terms of the parameters, and test
it against simulation results. Section IV discusses the properties of the
$\lambda$ model, showing that it displays interesting temporal fluctuations.
Section V compares the predictions to empirical evidence and discusses the
assumptions of the model in the context of real markets. In Section VI we
discuss the possible broader implications.

FIG. 1: Autocorrelation function of the time series of signs of orders that
result in immediate trades for the stock Shell traded at the London Stock
Exchange in the period May 2000 - December 2002, a total of
$5.8 \times 10^{?}$ events. (Horizontal axis: lag, in events.)



II. DESCRIPTION OF MODELS 

We develop a model with two variations, which we call the $\lambda$ model and
the fixed $N$ model. We first describe the $\lambda$ model, which is more
realistic, but for which we have only simulation results. We then describe
the fixed $N$ model, which is less realistic, but has the important advantage
of being simpler, allowing us to obtain analytic results. Because of the
simple nature of these results, they apply equally well to the $\lambda$
model.

We first describe the $\lambda$ model. Let $N(t)$ be the number of hidden
orders at time $t = 1, 2, \ldots, T$. At each time $t$ generate a new hidden
order with probability $0 < \lambda < 1$ if $N(t) > 0$, or probability one if
$N(t) = 0$. Assign each new hidden order a random sign $s_i$ and an initial
size $v_i(t^*) = L \Delta v$, where $t^*$ is the time when the hidden order is
created, and $L = 1, 2, \ldots$ is drawn from a Pareto distribution
$P(L) \sim L^{-(\alpha+1)}$ with $\alpha > 0$. The random variables $L$ and
$s_i$ are IID.^1 At each timestep $t$ an existing hidden order $i$ is chosen
at random with uniform probability, and a volume $\Delta v$ of that order is
removed, so that $v_i(t+1) = v_i(t) - \Delta v$. This generates a revealed
order of volume $\Delta v$ and sign $x_t = s_i$. A hidden order $i$ is removed
if $v_i(t+1) = 0$. Thus, the number of hidden orders $N(t)$ fluctuates in
time, depending on fluctuations in arrival and removal.

The fixed $N$ model is the same, except that the number of hidden orders $N$
is kept fixed. Thus, if a hidden order is removed it is immediately replaced
by a new one with a random sign and a new size.

The main result of this paper is the calculation of the autocorrelation
function of revealed order signs $x_t$ for the fixed $N$ model. We show in the
next section that the tail of the autocorrelation function asymptotically
scales as $\tau^{-(\alpha-1)}$. While varying $N$ affects the shape of the
autocorrelation function for small $\tau$, providing $\alpha$ is held fixed,
it does not affect its asymptotic scaling. Even though $N(t)$ varies in the
$\lambda$ model, the asymptotic behavior is independent of $N(t)$, and so the
asymptotic behavior of the autocorrelation function is the same. This is
particularly convenient because it allows us to make a prediction in terms of
observable quantities (see Section V).

^1 In the language of extreme value theory [?], the Pareto distribution is
just one example of a power law. A distribution $f(x)$ is a power law with
tail exponent $\alpha$ if there exists a slowly varying function $g(x)$ such
that $f(x) \sim K g(x) x^{-\alpha}$ as $x \to \infty$, where $K$ and $\alpha$
are positive constants. A function $g(x)$ is a slowly varying function if for
any $t > 0$, $\lim_{x \to \infty} g(tx)/g(x) = 1$. A common example of a
slowly varying function is $\log x$, so in this sense the function
$x^{-\alpha} \log x$ is a power law. Thus, the term "power law" refers not to
a specific distribution, but to an equivalence class of distributions with the
same asymptotic scaling properties. It is clear from the calculations leading
up to our main result, Eq. (17), that it is not necessary to assume that the
distribution of volumes is strictly Pareto distributed; any power law
distribution $p(L)$ with a given tail exponent $\alpha$ will give the same
asymptotic scaling for the autocorrelation function of revealed orders.



III. ANALYTIC COMPUTATION FOR FIXED $N$ MODEL

Because the hidden order arrival process is IID, it is possible to compute the
autocorrelation of the fixed $N$ model analytically. The basic idea of the
computation is to understand the behavior of the autocorrelation conditioned
on $L$, the initial length of the hidden order in units of the revealed order
size $\Delta v$, and then combine the results for different values of $L$.

We begin by giving a simple intuitive argument for the asymptotic scaling.
The probability at any instant of time that a revealed order comes from a
hidden order of length $L$ is $Q(L) \propto L p(L)$. This revealed order
contributes to inducing a positive autocorrelation at lag $\tau$ only if the
revealed order $\tau$ steps ahead comes from the same hidden order. In other
words, in order to contribute to the autocorrelation function at lag $\tau$,
a hidden order must be of length $L > A\tau$, where $A$ is a constant.
Summing over all such hidden orders gives an autocorrelation
$\rho(\tau) \sim \int_{A\tau}^{\infty} Q(L)\, dL \sim \tau^{-(\alpha-1)}$,
which is the main result of Eq. (17). In the remainder of this section, we
present a more detailed calculation, which also allows us to compute the
correct prefactor.

A. Autocorrelation in probabilistic terms

Under the convention that the signs of the revealed orders are
$x_t = \pm 1$, because of the symmetry between buying and selling
$E[x_t] = 0$ and $E[x_t^2] = 1$, where $E$ denotes the expectation. Therefore
the autocorrelation is simply $\rho(\tau) = E[x_t x_{t+\tau}]$. We can rewrite
this as

$$E[x_t x_{t+\tau}] = \sum_{L=1}^{\infty} Q(L)\, E[x_t x_{t+\tau} \mid L], \tag{1}$$

where $E[x_t x_{t+\tau} \mid L]$ is conditioned on the hidden order that
generated $x_t$ having length $L$. $Q(L)$ is the probability that a revealed
order drawn at random comes from a hidden order of length $L$. Let
$q(\tau \mid L)$ be the probability that revealed orders at times $t$ and
$t + \tau$ came from the same hidden order, given that it has original length
$L$. Because $E[x_t x_{t+\tau}] = 0$ if $x_t$ and $x_{t+\tau}$ came from
different hidden orders, and $E[x_t x_{t+\tau}] = 1$ if they came from the
same hidden order, the conditional expectation can be rewritten

$$E[x_t x_{t+\tau} \mid L] = q(\tau \mid L), \tag{2}$$

which implies

$$\rho(\tau) = \sum_{L=1}^{\infty} Q(L)\, q(\tau \mid L). \tag{3}$$
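Before evaluating Eq. (3) analytically, it can help to have a direct simulation of the fixed $N$ model defined in Section II as a numerical reference. The sketch below is an assumption-laden illustration rather than the simulation code used for the paper: the names `pareto_length` and `simulate_fixed_n` are hypothetical, lengths are measured in units of $\Delta v$, and integer lengths are drawn by flooring a continuous Pareto variate so that $P(L \geq l) = l^{-\alpha}$.

```python
import random

def pareto_length(rng, alpha):
    """Integer hidden-order length L >= 1 with P(L >= l) = l**(-alpha)."""
    u = 1.0 - rng.random()               # uniform on (0, 1]
    return int(u ** (-1.0 / alpha))      # floor of a continuous Pareto draw

def simulate_fixed_n(n_hidden, alpha, steps, seed=0):
    """Return the revealed order signs x_t of the fixed N model."""
    rng = random.Random(seed)

    def new_order():
        # [sign, remaining length in units of the revealed order size]
        return [rng.choice((-1, 1)), pareto_length(rng, alpha)]

    orders = [new_order() for _ in range(n_hidden)]
    signs = []
    for _ in range(steps):
        i = rng.randrange(n_hidden)      # pick a hidden order uniformly
        signs.append(orders[i][0])       # it emits one revealed order
        orders[i][1] -= 1
        if orders[i][1] == 0:            # exhausted: replace immediately
            orders[i] = new_order()
    return signs

signs = simulate_fixed_n(n_hidden=5, alpha=1.5, steps=10000)
```

The resulting list can be passed to any autocorrelation estimator; because the process has long-memory for $\alpha < 2$, long runs are needed before the sample autocorrelation settles.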



To compute $Q$, we note that the number of revealed orders coming from hidden
orders of length $L$ is proportional to $L p(L)$, where $p(L)$ is the
probability that a hidden order has length $L$. To compute $Q(L)$ we must
properly normalize this by summing over $L$,

$$Q(L) = \frac{L\, p(L)}{\sum_{L=1}^{\infty} L\, p(L)}. \tag{4}$$

This gives

$$\rho(\tau) = \frac{1}{\bar{L}} \sum_{L=1}^{\infty} L\, q(\tau \mid L)\, p(L), \tag{5}$$

where $\bar{L}$ is the average value of $L$.

The conditional probability $q(\tau \mid L)$ can be written

$$q(\tau \mid L) = w(L, \tau)\, p, \tag{6}$$

where $w(L, \tau)$ is the probability that a given hidden order is still
active after time $\tau$, and $p$ is the probability that it will be selected
for execution assuming it is still active. By assumption $p = 1/N$.

Computing $w(L, \tau)$ is more complicated: Let $s$ be the number of revealed
orders drawn from a given hidden order during the $\tau - 1$ timesteps between
time $t$ and time $t + \tau$, and let $P_{\tau-1}(s < k)$ be the probability
that $s$ is less than a given value $k$. Thus, for a hidden order that has
length $l$ at time $t$, the probability that it still exists at time
$t + \tau$ is $P_{\tau-1}(s < l)$. For a hidden order with original length
$L$, $l$ is uniformly distributed with probability $1/L$ over the values
$1, \ldots, L$. Thus we can express $w(L, \tau)$ as a sum of probabilities,
one for each possible value of $l$,

$$w(L, \tau) = \frac{1}{L} \left( P_{\tau-1}(s < L-1) + P_{\tau-1}(s < L-2) + \ldots + P_{\tau-1}(s < 1) \right). \tag{7}$$

The probabilities $P_{\tau-1}(s < k)$ can be expressed as sums of binomial
probabilities, corresponding to the possible sequences with which a given
hidden order generates at most $k - 1$ revealed orders,

$$P_{\tau-1}(s < k) = \sum_{h=0}^{k-1} \binom{\tau-1}{h} p^h (1-p)^{\tau-1-h}. \tag{8}$$

Therefore

$$q(\tau \mid L) = \frac{p}{L} \sum_{j=0}^{L-2} \sum_{h=0}^{j} \binom{\tau-1}{h} p^h (1-p)^{\tau-1-h}. \tag{9}$$
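For small arguments the binomial sums can be evaluated exactly, which gives a reference against which approximations can be checked. The sketch below assumes the survival probability is the average of $P_{\tau-1}(s < k)$ over $k = 1, \ldots, L-1$ divided by $L$, with each $P_{\tau-1}(s < k)$ a cumulative binomial probability with success probability $p = 1/N$; the function names are hypothetical.

```python
from math import comb

def prob_fewer_than(k, tau, p):
    """P_{tau-1}(s < k): fewer than k draws in tau - 1 binomial trials."""
    n = tau - 1
    return sum(comb(n, h) * p**h * (1 - p)**(n - h)
               for h in range(min(k, n + 1)))

def survival(L, tau, p):
    """w(L, tau): average of P(s < k) over k = 1..L-1, divided by L."""
    return sum(prob_fewer_than(k, tau, p) for k in range(1, L)) / L

def q_same_order(L, tau, n_hidden):
    """q(tau | L) = w(L, tau) * p with p = 1 / N."""
    p = 1.0 / n_hidden
    return survival(L, tau, p) * p

# With a single hidden order (N = 1) every step executes it, so an order of
# length 10 still links two revealed orders 3 steps apart in 7 of 10 cases.
print(q_same_order(10, 3, 1))
```

Exact integer binomial coefficients are used here for clarity; for lags in the thousands the product of a huge integer and a tiny float can overflow, which is one practical reason to move to the approximation introduced next.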



B. de Moivre-Laplace approximation 

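The autocorrelation can now be computed using Eq. (3). However, since the
sums of binomial coefficients are difficult to manage we will make use of the
de Moivre-Laplace approximation [?]. For $npq \gg 1$ (with $q = 1 - p$) one
can approximate

$$\binom{n}{k} p^k q^{n-k} \approx \frac{1}{\sqrt{2\pi npq}} \exp\left( -\frac{(k - np)^2}{2npq} \right). \tag{10}$$

As a consequence the sum of consecutive terms of a binomial distribution can
be approximated as

$$\sum_{k=k_1}^{k_2} \binom{n}{k} p^k q^{n-k} \approx \frac{1}{2} \left[ \mathrm{erf}\left( \frac{k_2 - np + 1/2}{\sqrt{2npq}} \right) - \mathrm{erf}\left( \frac{k_1 - np - 1/2}{\sqrt{2npq}} \right) \right], \tag{11}$$

where erf is the error function.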
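The quality of the de Moivre-Laplace approximation for a block of consecutive binomial terms is easy to check numerically. The sketch below compares an exact binomial sum with the error-function expression, including the half-unit continuity correction; the parameter values are arbitrary choices for illustration.

```python
from math import comb, erf, sqrt

def binomial_sum(n, p, k1, k2):
    """Exact sum of binomial probabilities for k1 <= k <= k2."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k1, k2 + 1))

def erf_approximation(n, p, k1, k2):
    """de Moivre-Laplace estimate of the same sum, with continuity correction."""
    scale = sqrt(2.0 * n * p * (1.0 - p))
    mean = n * p
    return 0.5 * (erf((k2 - mean + 0.5) / scale)
                  - erf((k1 - mean - 0.5) / scale))

n, p = 400, 0.2                       # npq = 64, comfortably above 1
exact = binomial_sum(n, p, 70, 90)
approx = erf_approximation(n, p, 70, 90)
print(exact, approx)
```

Lowering `n` until `n * p * (1 - p)` approaches 1 shows the two values drifting apart, which is the regime excluded by the validity condition.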

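By converting the sum to an integral, and letting $s = \tau - 1$, equation (9)
becomes

$$
\begin{aligned}
q(s+1 \mid L) &\approx \frac{p}{2L} \sum_{j=0}^{L-2} \left[ \mathrm{erf}\left( \frac{j - sp + 1/2}{\sqrt{2sp(1-p)}} \right) - \mathrm{erf}\left( \frac{-sp - 1/2}{\sqrt{2sp(1-p)}} \right) \right] \\
&\approx \frac{p}{2L} \int_{1/2}^{L-2+1/2} \left[ \mathrm{erf}\left( \frac{x - sp + 1/2}{\sqrt{2sp(1-p)}} \right) - \mathrm{erf}\left( \frac{-sp - 1/2}{\sqrt{2sp(1-p)}} \right) \right] dx.
\end{aligned} \tag{12}
$$

For the approximation of the sum by the integral we use
$\sum_{i=a}^{b} f(i) \approx \int_{a+1/2}^{b+1/2} f(x)\, dx$. Performing the
last integral gives

$$
\begin{aligned}
q(s+1 \mid L) \approx \frac{p}{2L} \Bigg\{ & \sqrt{\frac{2sp(1-p)}{\pi}} \left[ \exp\left( -\frac{(L-1-sp)^2}{2sp(1-p)} \right) - \exp\left( -\frac{(1-sp)^2}{2sp(1-p)} \right) \right] \\
& + (sp - 1)\, \mathrm{erf}\left( \frac{1 - sp}{\sqrt{2sp(1-p)}} \right) + (L - 2)\, \mathrm{erf}\left( \frac{sp + 1/2}{\sqrt{2sp(1-p)}} \right) \\
& + (1 + sp - L)\, \mathrm{erf}\left( \frac{1 + sp - L}{\sqrt{2sp(1-p)}} \right) \Bigg\}.
\end{aligned} \tag{13}
$$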



The sum over $L$ in Eq. (5) can be approximated by the integral

$$\rho(\tau) \approx \int_{1+1/2}^{\infty} q(\tau \mid L)\, \frac{L\, p(L)}{\bar{L}}\, dL. \tag{14}$$

Finally, we need to translate the domain of validity of the de Moivre-Laplace
approximation into more relevant terms. The condition $npq \gg 1$ in Eq. (10)
becomes $(\tau - 1)\, p\, (1 - p) \gg 1$. With $p = 1/N$ this leads to the
condition

$$\tau \gg \frac{N^2}{N - 1} + 1 \sim N, \tag{15}$$

i.e. the approximation is valid as long as the lag is much greater than the
number of hidden orders. Since the number of hidden orders is fixed, the
approximation is always valid for sufficiently large $\tau$.



We have tested these calculations for the simple case in which all hidden
orders have the same size $L_0$, i.e. $p(L) = \delta(L - L_0)$, where $\delta$
is the Dirac delta function. This implies $\rho(\tau) = q(\tau \mid L_0)$, so
that Eq. (13) gives a closed form expression for the autocorrelation function.
As expected, the approximation always agrees very well for large values of
$\tau$. The agreement is also good for small values of $\tau$ when $N$ is
small and $L_0$ is sufficiently large.

C. Pareto distribution 

We now consider the more realistic case that the hidden order size $L$ has a
Pareto distribution

$$p(L) = \frac{\alpha}{L^{\alpha+1}}, \tag{16}$$

where $\alpha > 1$ is the tail exponent. In this case the integral of Eq. (14)
cannot be performed analytically. We can, however, give an analytical
asymptotic expansion of the integral (14). The calculations detailed in the
appendix make use of the saddle point approximation. The result is that the
leading term of the asymptotic expansion of $\rho(\tau)$ is given by the terms
depending on erf functions in Eq. (13), and the autocorrelation function
decays asymptotically as

$$\rho(\tau) \sim \frac{N^{\alpha-2}}{\alpha}\, \tau^{-(\alpha-1)}. \tag{17}$$

a 

This result indicates that the autocorrelation function decays as a power law
with exponent $\gamma = \alpha - 1$. The number of hidden orders affects the
prefactor, but does not affect the scaling exponent. Interestingly, when
$\alpha = 2$ the prefactor is independent of $N$. When $\alpha < 2$ it is a
decreasing function of $N$, and when $\alpha > 2$ it is an increasing function
of $N$. The value $\alpha = 2$ separates the regime where the size of hidden
orders has infinite variance from the regime where the variance is finite.^2
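The dependence of the prefactor on $N$ described above can be made concrete with a few lines of code. The sketch assumes that the leading-order result has the form $N^{\alpha-2}\tau^{-(\alpha-1)}/\alpha$, which reproduces the stated behavior (independent of $N$ at $\alpha = 2$, decreasing in $N$ below it, increasing above it); the function name is a hypothetical helper.

```python
def asymptotic_autocorrelation(tau, alpha, n_hidden):
    """Assumed leading-order form: N**(alpha - 2) / alpha * tau**-(alpha - 1)."""
    return n_hidden ** (alpha - 2.0) / alpha * tau ** (-(alpha - 1.0))

# Prefactor behavior at fixed lag for the three regimes of alpha.
for alpha in (1.5, 2.0, 2.5):
    values = [asymptotic_autocorrelation(100, alpha, n) for n in (1, 5, 50)]
    print(alpha, values)
```

On log-log axes each curve is a straight line of slope $-(\alpha - 1)$, and changing $N$ only shifts it vertically.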

Fig. 2 compares the autocorrelation function predicted by Eq. (17) to a
simulation for $\alpha = 1.5$ and $N = 1$, $N = 5$, and $N = 50$. For large
values of $\tau$ the match is excellent, both in terms of the slope and the
size of the prefactor. For $N = 1$ the prediction matches the simulation
across the entire range of $\tau$. As expected, when $N$ increases the
prediction deviates at small $\tau$, but still matches for large $\tau$. We
have also checked the consequences of varying $\alpha$ and find that the
prefactor behaves as predicted by Eq. (17).

Note that we used $T = 10^{?}$ samples to simulate the model and compare to
theory. This is because for $\alpha = 1.5$ this is a strongly long-memory
process, and the convergence is extremely slow. This will become an issue
later on when we test the model against real data: even for very large sample
sizes the error bars remain quite large.

^2 Note that Buldyrev et al. [?] found a similar formula in the context of
structure in DNA sequences.

FIG. 2: (Color online) Autocorrelation of the fixed $N$ model with
$\alpha = 1.5$, for $N = 1$ (green circles), $N = 5$ (red squares) and
$N = 50$ (blue diamonds), based on a simulation with $T = 10^{?}$. This is
compared to the asymptotic predictions of Eq. (17), shown as dashed black
lines.



IV. LIQUIDITY FLUCTUATIONS OF THE $\lambda$ MODEL

We now return to discuss the $\lambda$ model. As a reminder, this differs from
the fixed $N$ model analyzed so far in that the number of hidden orders $N(t)$
is not fixed. Instead, new hidden orders are added with probability $\lambda$
when $N(t) > 0$, and probability 1 otherwise. For the mean of $N(t)$ to remain
bounded it is necessary that the rate of creation of new orders equal the rate
at which they are removed. This implies the model has a critical threshold
where $E[N(t)] \to \infty$. This can be simply computed as follows: Let $n(t)$
be the total number of future revealed orders stored in all hidden orders at
time $t$, i.e. $n(t) = \sum_{i=1}^{N(t)} v_i(t)/\Delta v$. The average rate of
change of $n(t)$ is

$$E[n(t+1) - n(t)] = R(n(t))\, \bar{L} - 1.$$

The first term represents addition of a new hidden order, and the second term
the removal of a revealed order at every timestep. The creation rate
$R(n(t)) = \lambda$ when $n(t) > 0$ and $R(n(t)) = 1$ otherwise. The average
length of a new hidden order is $\bar{L}$, which under the Pareto assumption
is $\bar{L} = \sum_{L=1}^{\infty} L\, p(L) = \alpha/(\alpha - 1)$. In the
limit where $E[n(t)]$ is large it is a good approximation to say that $n(t)$
is never zero, so that $R(n(t)) = \lambda$. Setting
$E[n(t+1) - n(t)] = 0$ implies the critical value $\lambda_c$ is

$$\lambda_c = 1/\bar{L} = (\alpha - 1)/\alpha = \gamma/\alpha. \tag{18}$$

For the last equality we have made use of the fact that $\gamma$ does not
depend on $N$ in Eq. (17), which indicates that $\gamma = \alpha - 1$ applies
equally well to the $\lambda$ model as long as $\lambda < \lambda_c$ (we have
verified this in simulations). We also confirm the dependence of the critical
behavior on $\alpha$ in Fig. 3.

FIG. 3: (Color online) The average number of hidden orders as a function of
the creation parameter $\lambda$ for $\alpha = 1.3$ (red downward pointing
triangles), $\alpha = 1.5$ (black circles) and $\alpha = 1.7$ (green upward
pointing triangles). The dashed lines are the corresponding predicted critical
values $\lambda_c = (\alpha - 1)/\alpha$.

FIG. 4: (Color online) Autocorrelation function of the number of active hidden
orders in the $\lambda$ model for four different values of $\lambda$, as shown
in the inset. The dashed black lines have slope $\alpha - 1$.
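The critical creation probability and the behavior of $N(t)$ below it can be explored with a short simulation. This is a sketch under stated assumptions, not the authors' code: the function names are hypothetical, order signs are omitted because they do not affect $N(t)$, and integer lengths are drawn by flooring a continuous Pareto variate, so the simulated mean length differs somewhat from $\alpha/(\alpha - 1)$.

```python
import random

def mean_hidden_order_length(alpha):
    """Mean length alpha / (alpha - 1) under the Pareto assumption."""
    return alpha / (alpha - 1.0)

def critical_lambda(alpha):
    """Creation probability at which the backlog drift vanishes."""
    return 1.0 / mean_hidden_order_length(alpha)

def backlog_drift(lam, alpha):
    """E[n(t+1) - n(t)] = lam * mean length - 1, valid while n(t) > 0."""
    return lam * mean_hidden_order_length(alpha) - 1.0

def simulate_lambda_model(lam, alpha, steps, seed=0):
    """Return the number of active hidden orders after each timestep."""
    rng = random.Random(seed)
    remaining = []                       # remaining length of each order
    counts = []
    for _ in range(steps):
        # Create an order with probability lam, or surely if none exist.
        if not remaining or rng.random() < lam:
            u = 1.0 - rng.random()       # uniform on (0, 1]
            remaining.append(int(u ** (-1.0 / alpha)))
        i = rng.randrange(len(remaining))
        remaining[i] -= 1                # one revealed order is executed
        if remaining[i] == 0:            # exhausted: swap-remove it
            remaining[i] = remaining[-1]
            remaining.pop()
        counts.append(len(remaining))
    return counts

print(critical_lambda(1.5))
counts = simulate_lambda_model(lam=0.1, alpha=1.5, steps=20000, seed=2)
```

Repeating the run for values of `lam` approaching `critical_lambda(alpha)` shows the average of `counts` growing rapidly, which is the behavior summarized by the dashed lines in Fig. 3.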

One of the interesting features of the $\lambda$ model is that it generates
long-memory fluctuations in the number of active hidden orders. This is caused
by positive feedback between the number of orders and the accumulation rate,
which arises because the average rate at which a given hidden order is
executed is $1/N(t)$. Thus when $N(t)$ is larger than average, the rate at
which active hidden orders are removed is lower than average, which tends to
cause $N(t)$ to increase above its average value. Such an increase is
triggered by random fluctuations in which one or more particularly large
orders are created; when these orders are finally removed, $N(t)$ decreases.
$N(t)$ thus makes large and persistent fluctuations. The autocorrelation
function has an asymptotic power law decay of the form
$\rho_N(\tau) \sim \tau^{-\gamma}$, as shown in Fig. 4. From simulations, we
find that $\gamma = \alpha - 1$.

For this model fluctuations in the number of hidden orders correspond to
fluctuations in the time to execute an order. In economics this is one aspect
of what is called liquidity, which is a general term referring to the ease of
execution of an order. One of the interesting properties of prices of economic
time series is that they display what is commonly called clustered volatility,
i.e. the diffusion rate of price changes is strongly autocorrelated in time,
and in fact is a long-memory process [?]. It has recently been shown that this
is related to fluctuations in liquidity, in this case defined as the price
response to an order of a given size [?]. The fact that this kind of model
predicts long-memory fluctuations in another aspect of liquidity (the time to
execute an order) may be related to the explanation of clustered volatility.



V. TESTING THE PREDICTIONS 

Unfortunately, data comparing hidden orders and revealed orders are not widely
available, which complicates the problem of testing this model. The only data
set we know of that includes the kind of data that is needed for a proper test
was used by Chan and Lakonishok [?] to study the execution of customer orders
at large brokerage firms. Unfortunately, they did not fit functional forms to
the size distributions or test for long-memory, and we have not been able to
obtain their data. Their study does make it clear that order splitting is very
common, and suggests that the time scale on which order splitting occurs is
sufficiently long to match the autocorrelations in order flow.

We compare the predictions of the model to the data in two different ways. The
first is based on computation of the scaling exponents, described in
Section V B, and the second is based on the properties of run length,
described in Section V C. Before presenting the first test, we must first
review the market structure.



A. Market structure and order distributions 

Although we have no transaction data with direct information about hidden
orders, we can perform an indirect test of the scaling relations predicted by
the model which takes advantage of the market structure used in the New York
Stock Exchange and the London Stock Exchange. They both employ two parallel
markets which provide alternative methods of trading, called the on-book or
"downstairs" market, and the off-book or "upstairs" market. In the LSE orders
in the on-book market are placed publicly but anonymously and execution is
completely automated. The off-book market, in contrast, operates through a
bilateral exchange mechanism, via telephone calls or direct contact of the
trading parties.

The anonymous nature of the on-book market facilitates order splitting, and it
is clear that it is a common practice. This is also supported by the fact that
in our data set it is possible to track the on-book orders for individual
trading institutions, and the long-memory property of order flow is evident
even for single institutions [19]. In contrast, off-book trading is based on
personal relationships and order splitting is believed to be less frequent.
This is because a series of orders of the same sign tend to gradually change
the price in a direction that is unfavorable to the other party [?].

Thus one might make the hypothesis that in the off- 
book market people just submit their orders rather than 
hiding them, while in the on-book market they hide their 
true orders and execute them through a series of revealed 
orders. While there is some truth in this hypothesis, it is 
not strictly true. When we examine sequences of off-book 
trades for individual institutions, we often see long runs 
of trades of the same sign, suggesting that order split- 
ting is also fairly common in the off-book market. Even 
though order-splitting is not common when trading with 
the same party, it is still possible to split a large order 
and trade it in the off-book market with many different 
parties. Thus the transactions in the off-book market 
have already undergone some order splitting, and it is 
not clear how well the distribution of transactions corre- 
sponds to that for hidden orders. 

Despite the caveats mentioned above, we will press forward with the hypothesis
that off-book trades can be used as a proxy for hidden orders, and see how the
predictions of our model match the empirical observations of order splitting.
To this end we select 20 highly capitalized stocks traded at the London Stock
Exchange in the period May 2000 - December 2002. The stocks we analyzed are
Astrazeneca (AZN), Bae Systems (BA.), Baa (BAA), BHP Billiton (BLT), Boots
Group (BOOT), British Sky Broadcasting Group (BSY), Diageo (DGE), Gus (GUS),
Hilton Group (HG.), Lloyds Tsb Group (LLOY), Prudential (PRU), Pearson (PSON),
Rio Tinto (RIO), Rentokil Initial (RTO), Reuters Group (RTR), Sainsbury
(SBRY), Shell Transport & Trading Co. (SHEL), Tesco (TSCO), Vodafone Group
(VOD), and WPP Group (WPP). The number of trades for the combined group of
stocks is $16.7 \times 10^{6}$; of these $11 \times 10^{6}$ are on-book trades
and $5.7 \times 10^{6}$ are off-book trades.

In Fig. 5 we show the empirical probability distributions for the volume of
trades in both the off-book and on-book markets in the London Stock Exchange.
We show this for an aggregate of 20 heavily traded stocks and for the single
stock Astrazeneca, which is typical of the stocks in the sample. This makes it
clear that the tails die out more slowly in the off-book market. The largest
trade sizes in the off-book market are more than a factor of ten larger than
those in the on-book market; for Astrazeneca, for example, the largest orders
are roughly four million shares in the off-book market vs. 200 thousand in the
on-book market. Alternatively, to measure the decay of the tails more
quantitatively, we assume the asymptotic relation for volume $V$ is
$P(V > x) \sim x^{-\alpha}$, and estimate $\alpha$ using a Hill estimator
applied to the largest one percent of the data [?]. For the aggregate data set
this gives $\alpha = 1.59$ for the off-book data, $\alpha = 2.90$ for the
on-book data, and $\alpha = 1.64$ for the combined data.^3 Similar values are
computed for individual stocks, as shown in Fig. 6. The average values are
$\alpha = 1.74 \pm 0.23$ for off-book, $\alpha = 4.2 \pm 1.5$ for on-book, and
$\alpha = 1.36 \pm 0.10$ overall. These results are consistent with the
hypothesis that order splitting is more common in the on-book market than it
is in the off-book market. However, they also suggest that the separation
between the styles of trading in these two markets is not absolute. They both
show an approximate power law decay in their tails, although this decay is
much steeper for the on-book market.

Finally Fig. 6 shows that the exponent for the volume distribution of the
aggregate of the on- and off-book trades is systematically smaller than the
exponent for either of them by themselves. This is caused by the aggregation
of two distributions: Mixing distributions with different scaling properties
tends to fatten the tails. It indicates that one should be very careful in
aggregating distributions.^4

^3 The results for the combined data set are in rough agreement with those
first reported for the NYSE and NASDAQ by Gopikrishnan et al. [?] and for the
LSE and Paris by Gabaix et al.

FIG. 5: Volume distributions of off-book trades (circles), on-book trades
(diamonds) and the aggregate of both (squares). In (a) we show this for a
collection of 20 different stocks, normalizing the volume of each by the mean
volume before combining, whereas (b) shows unnormalized values (in shares) for
the stock Astrazeneca. The number of trades in each case is
$11 \times 10^{6}$ (aggregate on-book), $5.7 \times 10^{6}$ (aggregate
off-book), $8.0 \times 10^{?}$ (AZN on-book) and $2.8 \times 10^{?}$ (AZN
off-book). The dashed black lines have the slope found by the Hill estimator
(and are shown for the largest one percent of the data).

FIG. 6: Scaling exponents $\alpha$ for the twenty stocks we study here, based
on the hypothesis that the largest one percent of the trades $V$ are described
by the relation $P(V > x) \sim x^{-\alpha}$. The stocks are arranged along the
x axis in alphabetical order. The circles refer to off-book trades, the
diamonds to on-book and the squares to the aggregate of both. For comparison
we draw a dashed line for $\alpha = 1.5$.

FIG. 7: The scaling exponents $\alpha$ for the twenty stocks we study here
(with the hypothesis $P(V > x) \sim x^{-\alpha}$) plotted against the exponent
$\gamma$ of the autocorrelation function (under the hypothesis
$\rho(\tau) \sim \tau^{-\gamma}$). The error bars shown are the 95-percent
confidence intervals of the Hill estimator, under the assumptions of IID
errors and perfect Pareto scaling across the entire range of $V$. Both
assumptions are highly optimistic.
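The tail exponent estimates discussed above rely on the Hill estimator applied to the largest one percent of the data. The following is a minimal sketch of that estimator with its textbook asymptotic standard error $\alpha/\sqrt{k}$; the function name, the synthetic Pareto sample and the seed are assumptions for illustration, and none of the numbers correspond to the LSE data.

```python
import math
import random

def hill_estimator(values, tail_fraction=0.01):
    """Hill estimate of the tail exponent from the largest values.

    Returns (alpha, stderr), where stderr = alpha / sqrt(k) is the
    asymptotic standard error under IID sampling and exact Pareto tails.
    """
    ordered = sorted(values, reverse=True)
    k = max(2, int(len(ordered) * tail_fraction))
    threshold = ordered[k]               # the (k+1)-th largest observation
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    alpha = 1.0 / mean_log_excess
    return alpha, alpha / math.sqrt(k)

# Synthetic check on an exact Pareto sample with known exponent 1.5.
rng = random.Random(42)
sample = [rng.paretovariate(1.5) for _ in range(200000)]
alpha_hat, stderr = hill_estimator(sample)
print(alpha_hat, 1.96 * stderr)          # estimate and 95% half-width
```

The quoted interval is only as good as the IID and exact-Pareto assumptions behind it, which is why the resulting error bars are described as optimistic for real trade volumes.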



B. Predicted vs. actual values of $\gamma$

Taking the off-book market as a crude proxy for hid- 
den orders, we test the model by comparing 7 = a — 1 as 
predicted by Eq. (fTB|l to the value of 7 measured directly 
from the order signs. The scaling exponent 7 is measured 
by computing the Hurst exponent of the series of market 
order signs for each stock using the DFA method [25l |. 
and making use of the relation 7 = 2(1 — H). (This is 
much more accurate than computing the autocorrelation 
function directly) . We compare the predicted and actual 
values in Fig. [7| The average value of the scaling expo- 



* When power law distributions are combined the one with the low- 
est tail exponent determines the tail exponent of the aggregate. 
For a finite sample, however, there are often slow convergence 
effects as a function of sample size that can alter this conclusion. 



nent of the autocorrelation function is 7 = 0.57 ± 0.05. 
This can be compared either to 7 = 0.74 ± 0.23 based on 
the average value of a, or to 7 = 0.59 based on the a for 
the aggregate distribution. In either case the agreement 
is well within the error bars. (The error bars, which are 
based on the standard error of the mean of the 20 stock 
sample, are highly optimistic due to correlations within 
the sample and possibly also due to skewness and sys- 
tematic bias of the Hill estimates) . 
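
The measurement of $\gamma$ through the Hurst exponent can be sketched as follows. This is a minimal first-order DFA written for illustration, not the implementation used for the paper; the window sizes, the function name `dfa_hurst`, and the IID test series are assumptions. For an uncorrelated sign series one expects $H$ near 0.5 and hence $\gamma = 2(1 - H)$ near 1, which is the boundary case with no long memory.

```python
import numpy as np

def dfa_hurst(series, scales):
    """Estimate the Hurst exponent H by first-order detrended fluctuation analysis."""
    x = np.asarray(series, dtype=float)
    profile = np.cumsum(x - x.mean())            # integrated, mean-removed series
    fluct = []
    for n in scales:
        n_win = profile.size // n                # non-overlapping windows of length n
        windows = profile[: n_win * n].reshape(n_win, n).T   # shape (n, n_win)
        t = np.arange(n)
        # Fit a straight line to every window at once (one window per column).
        coeffs = np.polyfit(t, windows, 1)       # shape (2, n_win)
        trend = np.outer(t, coeffs[0]) + coeffs[1]
        fluct.append(np.sqrt(np.mean((windows - trend) ** 2)))
    # F(n) ~ n**H, so H is the slope in log-log coordinates.
    slope, _ = np.polyfit(np.log(scales), np.log(fluct), 1)
    return float(slope)

rng = np.random.default_rng(0)
iid_signs = rng.choice([-1, 1], size=2 ** 15)    # uncorrelated buy/sell signs
scales = np.unique(np.logspace(np.log10(16), np.log10(1024), 12).astype(int))

hurst = dfa_hurst(iid_signs, scales)
gamma = 2.0 * (1.0 - hurst)
print(f"H = {hurst:.2f}, gamma = {gamma:.2f}")
```

Applied to a real order sign series, a value of $H$ clearly above 0.5 corresponds to $\gamma < 1$, i.e. a long-memory process.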

As a stronger test, one might hope that variations in
measured values of $\alpha$ might predict variations in
measured values of $\gamma$. The model fails this test.
Performing a regression of predicted vs. actual values gives a
statistically insignificant, slightly negative slope. There are
several possible explanations for this. First, as we have
already discussed, the off-book data may be a poor proxy
for hidden orders. Second, the sample errors are very
large, particularly for measuring $\alpha$. The error bars we
have shown for $\alpha$ in Fig. 7 are the 95-percent confidence
intervals of the Hill estimator under the assumption that
the data are IID and that the top one percent of the values
have converged to a perfect Pareto distribution. This
is clearly far too optimistic, as can be seen by breaking
the data into subsamples; the variation from year to
year is much larger than the error bars given by the Hill
estimator. Even though our samples are large, the errors
are still large because both volume and order signs
are long-memory processes [19, 20], and averages generally
converge as $T^{-(1-H)}$, where $H \approx 0.75$ in both cases.
In addition, the measured values of $\alpha$ have larger errors
than those of $\gamma$ due to a strong tendency of the volume
to trend upward, an effect that isn't easily removed by
simple normalization. Gabaix et al. have conjectured
that the exponent $\alpha$ for the volume distribution has a
universal value $\alpha = 3/2$; if true, this would imply that
deviations from that value are purely statistical fluctuations.
Finally, it is of course possible that our model is
wrong, due to violations of the assumptions of the model.
We list some of the possible problems in Section V D.
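
To make the slow convergence concrete, the following short calculation compares the scaling $T^{-(1-H)}$ quoted above for $H = 0.75$ with the IID benchmark $H = 0.5$. It is plain arithmetic on the stated scaling law; the function names are illustrative.

```python
def error_shrink_factor(t_small, t_large, hurst):
    """Factor by which the error of an average shrinks when the sample
    grows from t_small to t_large, assuming error ~ T**-(1 - H)."""
    return (t_large / t_small) ** (-(1.0 - hurst))

def growth_needed_to_halve_error(hurst):
    """Factor by which the sample length T must grow to halve the error."""
    return 2.0 ** (1.0 / (1.0 - hurst))

for hurst in (0.5, 0.75):
    print(f"H = {hurst}: need {growth_needed_to_halve_error(hurst):.0f}x more data "
          f"to halve the error; 100x more data shrinks it by "
          f"{error_shrink_factor(1, 100, hurst):.3f}")
```

Under this scaling, halving the error takes four times as much data in the IID case but sixteen times as much when $H = 0.75$.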

C. Run length 

Another test for comparing the models to data concerns
the distribution of run lengths. A run is a series of
revealed orders that are all of the same sign. In figure 8
we compare the run length distribution of the real order
flow with a simulation of both the fixed $N$ model
and the $\lambda$ model. In panel (a) we show the autocorrelation
function of the sign of market orders for the stock
Astrazeneca (AZN) and compare it with the autocorrelation
of a simulation of the two models. The parameters
are $N = 24$ and $\alpha = 1.63$ for the fixed $N$ model and
$N = 21.1$, $\alpha = 1.63$, and $\lambda = 0.38$ for the $\lambda$ model.
These parameters were chosen to give a best fit to the
autocorrelation function of the real data. Both models
are able to capture the asymptotic behavior of the
autocorrelation function, but the fixed $N$ model clearly
underestimates the autocorrelation function for small lags.
We can get a more detailed test by comparing the run
length distribution of the models and the data, as shown
in panel (b) of figure 8. The figure shows that the
$\lambda$ model is able to describe the run length distribution,
whereas the fixed $N$ model underestimates the run length
probability for long runs. The $\lambda$ model appears to be a
better candidate for describing real order flow.
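
A run length distribution of this kind can be computed as in the sketch below. The function `run_lengths` follows directly from the definition of a run given above. The simulator `simulate_fixed_n` is only our assumed reading of the fixed $N$ dynamics (at each step one of $N$ hidden orders is picked uniformly at random, one unit is revealed with that order's sign, and an exhausted order is replaced by a new one with random sign and Pareto-distributed integer size); it is a hypothetical helper for illustration and should be checked against the model definition before being used for any comparison with data.

```python
import numpy as np
from collections import Counter

def run_lengths(signs):
    """Lengths of maximal runs of identical consecutive signs."""
    signs = np.asarray(signs)
    if signs.size == 0:
        return np.array([], dtype=int)
    # Positions where the sign changes, plus both ends of the series.
    change = np.flatnonzero(signs[1:] != signs[:-1]) + 1
    edges = np.concatenate(([0], change, [signs.size]))
    return np.diff(edges)

def simulate_fixed_n(n_orders, alpha, n_steps, rng):
    """Assumed fixed-N dynamics (see the hedged description in the text)."""
    sizes = np.floor(rng.pareto(alpha, n_orders) + 1.0).astype(int)  # sizes >= 1
    order_signs = rng.choice([-1, 1], n_orders)
    revealed = np.empty(n_steps, dtype=int)
    for t in range(n_steps):
        i = rng.integers(n_orders)          # pick one hidden order uniformly
        revealed[t] = order_signs[i]        # reveal one unit with its sign
        sizes[i] -= 1
        if sizes[i] == 0:                   # exhausted: draw a fresh hidden order
            sizes[i] = int(np.floor(rng.pareto(alpha) + 1.0))
            order_signs[i] = rng.choice([-1, 1])
    return revealed

rng = np.random.default_rng(1)
signs = simulate_fixed_n(n_orders=24, alpha=1.63, n_steps=50_000, rng=rng)
lengths = run_lengths(signs)
counts = Counter(lengths.tolist())
for length in sorted(counts)[:5]:
    print(length, counts[length] / lengths.size)   # empirical P(run length)
```

The same `run_lengths` function can be applied unchanged to a real series of market order signs, which is what a comparison like panel (b) requires.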

D. Review of assumptions 

Below we give a brief discussion of the assumptions of
the model, as well as the circumstances under which
violating them might alter its basic conclusions.

• Distribution of hidden orders. This has already
been discussed in some detail above. Here we
want to add that we have not addressed the possible
cause of the power law distribution of hidden
orders. One possibility (originally suggested
by Levy and Solomon and developed by Gabaix et
al.) is that the hidden order size distribution
is in some way related to the power law
distribution of the size of holdings of the largest
market participants.

• IID hidden order arrival. Strong autocorrelations
in hidden order size or hidden order signs could
affect $\gamma$, particularly if these were strong enough to
be long-memory.

• Distribution of revealed orders. In reality, revealed
orders do not have constant size. If their distribution
is sufficiently thin tailed we think the model
should still be valid. Power law tails, however,
might affect $\gamma$.

FIG. 8: (Color online) (a) Autocorrelation function
of the market order sign for the stock Astrazeneca
(black line) compared with the autocorrelation
function of a numerical simulation of the fixed
$N$ model (red filled circles, parameters $N = 24$ and
$\alpha = 1.63$) and of the $\lambda$ model (empty blue circles,
parameters $\alpha = 1.63$ and $\lambda = 0.38$, which implies
an average value of $N = 21.1$). (b) Probability
distribution of the run length for real data and
simulations of the model. The symbols and parameters
are the same as in panel (a).

• Aggregation of orders. In reality, there is a limited
number of brokerage firms, and when they receive
hidden orders with opposite signs within a sufficiently
short period of time, they may cross such
orders internally before they execute the remainder
externally. This will reduce the amount of unexecuted
volume and improve market clearing. In our
model it has the potential to change the effective
value of $N$. However, because of the independence
of the asymptotic scaling behavior on $N$ we do not
think this will affect $\gamma$.

• Feedback between order execution and order genera- 
tion. In our model we do not worry about whether 
revealed orders are actually executed. In reality 
many revealed orders may never be executed. In 
this case there may be feedback effects, i.e., if an 
order is not executed the hidden order size is not 
decreased, and consequently may result in the gen- 
eration of additional revealed orders when the agent 
tries again. We cannot say with certainty that such 
effects are not important. However, one piece of rel- 
evant evidence is that within statistical error the 
same scaling is observed for market orders, limit 
orders, and cancellations [19]. Since market orders
are by definition executed immediately, this sug- 
gests that such feedback effects are of minor im- 
portance. 



VI. DISCUSSION 

We have presented and solved a rather idealized model 
of the long-memory of order flow which was designed 
to yield tractable results. As detailed in the preceding 
section, many of its assumptions are not strictly true. 
At the very least, though, it illustrates how two appar- 
ently disparate phenomena may be linked together, and 
makes quantitative predictions about their relationship. 
Because we lack the proper data to test the model, we 
have used an imperfect proxy to test the model. The 
model passes this test. However, it would be nice to do 
a more definitive test, based on a data set that more 
closely characterizes the dichotomy between hidden and 
revealed orders. Even if the model is not strictly true, 
the model could potentially be extended to include more 
realistic assumptions, such as a non-trivial distribution 
of revealed order sizes. 

The long-memory of order signs is interesting for its
own sake, but it may also have more profound effects on
other aspects of the market. The persistent autocorrelation
function associated with a long-memory process implies
that order signs are highly predictable, even with
a simple linear time series model. Since
buy orders tend to generate a positive price response, and
sell orders tend to generate a negative price response, all
other things being equal this would translate into easily
exploitable predictable movements in prices. In order to
prevent this from happening, other features of the market
have to adjust to compensate. Such features include
the size of buy vs. sell orders, the volume of unexecuted
orders at the best prices, and many other aspects of the
market [2]. Market participants do not behave
out of philanthropic motives; presumably these effects
all come about due to the application of profit-making
strategies. It is not at all obvious what these strategies
are, and how they combine to eliminate this inefficiency.
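
The predictability of a persistent sign series by a linear model can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy series consists of runs of a single random sign with Pareto-tailed lengths (a stand-in for real order flow, not the paper's model with many hidden orders), the autoregressive order 20 is arbitrary, and `pareto_run_signs`, `lagged_matrix` and `fit_ar` are hypothetical helper names. The model is fitted by ordinary least squares on the first half of the series and scored out of sample on the second half.

```python
import numpy as np

def pareto_run_signs(alpha, length, rng):
    """Toy persistent series: runs of one random sign with Pareto-tailed lengths."""
    pieces, total = [], 0
    while total < length:
        run = int(np.floor(rng.pareto(alpha) + 1.0))     # run length >= 1
        pieces.append(np.full(run, rng.choice([-1, 1])))
        total += run
    return np.concatenate(pieces)[:length].astype(float)

def lagged_matrix(series, order):
    """Row j holds the `order` values preceding series[order + j], newest first."""
    n = len(series)
    return np.column_stack([series[order - k - 1 : n - k - 1] for k in range(order)])

def fit_ar(series, order):
    """Least-squares AR(order) coefficients (no intercept)."""
    coeffs, *_ = np.linalg.lstsq(lagged_matrix(series, order), series[order:], rcond=None)
    return coeffs

rng = np.random.default_rng(2)
order = 20
signs = pareto_run_signs(alpha=1.5, length=40_000, rng=rng)
train, test = signs[:20_000], signs[20_000:]

coeffs = fit_ar(train, order)
predicted = np.sign(lagged_matrix(test, order) @ coeffs)   # forecast sign of next order
hit_rate = float(np.mean(predicted == test[order:]))
print(f"out-of-sample hit rate: {hit_rate:.3f}")
```

A hit rate well above one half for the sign says nothing yet about prices; as argued above, other features of the market must adjust so that this predictability does not carry over to returns.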



The market response to the long-memory of order flow 
is an interesting example of a self-organized collective 
phenomenon. It may be one of the causes of other im- 
portant properties of prices, such as the long-memory 
in their diffusion rate. We have demonstrated that the $\lambda$
model, which allows fluctuations in the number of hidden
orders, automatically generates fluctuations in liquidity.
This is known to affect price diffusion rates. The
independence of the asymptotic scaling from the number of
hidden orders, which was not obvious to us before doing the
calculation, is a convenient property of our result that makes
it possible to test the model based on information that can be
feasibly gathered. This is thus a falsifiable model.
APPENDIX

In this appendix we evaluate the asymptotic behavior
of the autocorrelation $\rho(\tau)$ of Eq. (14) when the hidden
order size $L$ has the Pareto distribution of Eq. (16). We
split the integral of Eq. (14) into three parts and we set
$b = p(1 - p)$.

The first contribution is 

This can be calculated explicitly. It is 
which asymptotically goes as 

This decay is very fast due to the exponential term. 
The second contribution is 

2 /-ooexpl- ^ 1 

This integral cannot be computed analytically. In order
to get its asymptotic behavior for large $s$ (i.e. large $\tau$) we
make use of the saddle point approximation [22]. To give
an idea of the approximation, let us consider the case in
which one has to calculate the asymptotic behavior of an
integral of the type

$$\int_a^b dx\, e^{-N f(x)} \qquad (23)$$

for large values of $N$. If there exists a point $x_0$ in $(a, b)$
which is a minimum for $f(x)$, then we can expand $f(x)$
around $x_0$, yielding

$$e^{-N f(x)} \simeq \exp\left[-N\left(f(x_0) + \tfrac{1}{2} f''(x_0)(x - x_0)^2\right)\right], \qquad (24)$$

and we can compute the Gaussian integral

$$\int_a^b dx\, e^{-N f(x)} \simeq \sqrt{\frac{2\pi}{N f''(x_0)}}\, \exp\left(-N f(x_0)\right). \qquad (25)$$

The method can be applied also when the integral is not
of the form (23), provided that the integrand can be written
as $\exp(-f(x, N))$. In our case the integral in the second
contribution can be rewritten as



$$\int_{3/2}^{\infty} \exp\left[-\left(\frac{(L - 1 - sp)^2}{2bs} + (\alpha + 1)\log L\right)\right] dL. \qquad (26)$$
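
As a numerical sanity check, the sketch below integrates $\exp[-(L - 1 - sp)^2/(2bs)]\, L^{-(\alpha+1)}$ from $L = 3/2$ upward with the trapezoid rule and compares it with the leading-order estimate $\sqrt{2\pi b s}\,(sp)^{-(\alpha+1)}$. That closed form is our own leading-order reading of the Gaussian-integral formula applied to this integrand (Gaussian width $\sqrt{bs}$, power law evaluated near the peak at $L \approx sp$) and should be treated as an assumption rather than as the exact expression in the text. The values $p = 1/24$ and $\alpha = 1.63$ are illustrative, with $p = 1/N$ itself an assumption borrowed from the fixed $N$ fit.

```python
import numpy as np

def integrand(L, s, p, alpha):
    """Gaussian in L centred at 1 + s*p, multiplied by the power law L**-(alpha+1)."""
    b = p * (1.0 - p)
    return np.exp(-((L - 1.0 - s * p) ** 2) / (2.0 * b * s)
                  - (alpha + 1.0) * np.log(L))

def numerical_integral(s, p, alpha, points=400_001):
    """Trapezoid rule from L = 3/2 to 12 standard deviations past the peak."""
    b = p * (1.0 - p)
    upper = 1.0 + s * p + 12.0 * np.sqrt(b * s)
    L = np.linspace(1.5, upper, points)
    y = integrand(L, s, p, alpha)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(L)))

def saddle_point_estimate(s, p, alpha):
    """Assumed leading-order estimate: sqrt(2*pi*b*s) * (s*p)**-(alpha+1)."""
    b = p * (1.0 - p)
    return float(np.sqrt(2.0 * np.pi * b * s) * (s * p) ** (-(alpha + 1.0)))

p, alpha = 1.0 / 24.0, 1.63   # illustrative values
ratios = {}
for s in (1e3, 1e4, 1e5):
    ratios[s] = numerical_integral(s, p, alpha) / saddle_point_estimate(s, p, alpha)
    print(f"s = {s:.0e}: numerical / saddle-point = {ratios[s]:.4f}")
```

The ratio approaching one as $s$ grows is what the approximation requires; deviations at moderate $s$ come from the finite width of the Gaussian relative to its distance from the origin.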



By applying the saddle point approximation one easily 
gets for the integral the approximation 

V2^ exp (^^^ (sp)-("+i) , (27) 

and by putting also the prefactor we get for the second 
contribution 

(„_l)p(l_p)exp(^^^ (sp)-"^-^. (28) 

Thus the second contribution gives a power law behavior
but with an exponent $\alpha$ rather than $\alpha - 1$.

The third contribution is the one depending on the 
three erf functions 

' (sp-l)erf( 1 ) + (L - 2)erf(- ^ 



2L 



3/2 



I2hs 



+ (l + 5p-L)erf( ^ f±'^ ) rfL.(29) 



After some algebraic manipulations we can rewrite this 
term as 



p(a- 1) 



+ 



2a 

vipi- 1) 



a- 1 



- 2 



erf 



1 — ps 



erf 



3/2 



(i — 1 — ps)erf ( ^ ^ 



\ V2bs 
I 



/2bs 



1/2 + sp 
V2b^ 



r-r dL, 



(30) 



where $\mathrm{erf}(x_1, x_2) = \mathrm{erf}(x_2) - \mathrm{erf}(x_1)$ and we have
used the fact that the mean of $L$ is $\alpha/(\alpha - 1)$. The term in
square brackets has asymptotic behavior



Pja - 1) 
2a 



a — 1 



-(e 



p/2b _ ^p/b 



exp 



(-*) 



ps 



(31) 

and it is dominated by the exponential. The result is
obtained by using the asymptotic expansion of the erf
function.

Finally we compute the asymptotic behavior of the
integral in Eq. (30), i.e.



3/2 



(L — 1 — ps)erf 



1 — ps L — 1 — ps 



/26s 



■ dL. 

(32) 

It is convenient to perform first an integration by parts 
obtaining 



/ = — 



L 



1 + ps 



3/2 



V 1 



L 



a a 
I +ps 
a 



erf 



1 — ps L — \ ~ ps 



/2bs 



R\j2hs 

r 



exp 



V2&S 

(L — X — psy 

2hs 



I3/2 



dL. 



(33) 



The finite term decays exponentially to zero because of
the properties of the error function. The asymptotic
behavior of the two integrals can be computed with the
saddle point method in the same way as above. Both
decay asymptotically as a power of $s$, and the final result
is, up to a prefactor independent of $\tau$,

$$\rho(\tau) \propto \tau^{-(\alpha - 1)}, \qquad (34)$$

which coincides with Eq. (17).